from IPython.display import Latex
The interpretation of U-Th data relies in this case on a time evolution law of the U-series nuclides, which is the classical law used to account for the variations of the radioactive disequilibria during weathering, namely a loss and gain model , . In the frame of such a scenario the temporal evolution of the number of the atoms of the five 238U-234U-230Th-232Th-226Ra nuclides per gram of sample is described by the four following equations:
where $\lambda_i$ are the decay constants (in $y^{-1}$) of nuclides $i$ (here 238U, 234U, 230Th, 232Th and 226Ra), $k_i$ are the first-order rate constants (in $y^{-1}$) for leaching of nuclides $i$ and $f_i$ are the input fluxes (in $y^{-1}$) of nuclides $i$ gained by the regolith.
For simplification, the input fluxes are expressed as the proportion of the number of atoms of nuclides added per year to the initial sample (expressed as ${\lambda}_i N_0$).
Solving the above equation system allows to calculate the theoretical activities of the various nuclides in the sample at time t and the four independent activity ratios (234U/238U), (230Th/238U), (230Th/232Th) and (226Ra/???).
The nuclides activities and the activity ratios are a function of the mobility parameters ($k_i$, $f_i$) of the model, the initial activities of the different nuclides and the time $t$. The time $t$ corresponds to the time span between the initial state of the sample at $t=0$ and its present state.
Thus, the time $t$ corresponds to the time duration elapsed for a sample to move from the starting position to its current position. Usually the reference sample is collected at a locality upstream in the plain, which corresponds to the starting or reference position. It is important to note that such an approach implies that the transfer of sediments within the plain works at steady state.
The mobility parameters ($k_i$, $f_i$) are usually unknown; therefore, the analysis of the radionuclides and the related radioactive disequilibria in two samples are not sufficient to calculate the time $t$. To solve this problem a series of samples has to be collected in soil at different depths. Due to the complexity of the complete China sample, the complete set has been boxed into four sub-samples in wich the mobility parameters are considered constant.
The determined radioactive disequilibria of these samples are then used to determine (1) the mobility parameters of the different radionuclides and (2) the age of the different samples relative to the reference sample; this age corresponds to a production rate $p$ and is related to the time span necessary to move from the reference position to its current position.
The same statistical framework has been conducted for each profil and each parameter.
For fitting and computing probability density functions (PDF), we used Python packages scipy.stats, seaborn for statistical analysis, pandas to store/access large data frames and plotly for interactive graphic rendering. A hundred of bins was used to perform initial distributions.
Afterwards, a kernel density estimation (KDE) is applied to provide an estimate of underlying distributions. The KDE algorithm takes a parameter, bandwidth, that affects how "smooth" the resulting curve is. We use the same bandwidth $n^{-{1}/{5}}=0.0725$ where $n$ is the size of the population ($n \approx 500~000$).
Thus, obtained KDE curve has been fitted on a normal distribution (normalization) in order to compute mean $\mu$ and standard deviation $\sigma$.
In statistics, kernel density estimation (KDE) is a non-parametric way to estimate the probability density function of a random variable. Kernel density estimation is a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample. In some fields such as signal processing and econometrics it is also termed the Parzen–Rosenblatt window method, after Emanuel Parzen and Murray Rosenblatt, who are usually credited with independently creating it in its current form. (source wikipedia)
Remarque : Un lien intéressant (en tout cas pour moi !) pour mieux comprendre sa construction How to build KDE function
import numpy as np
import pandas as pd
import seaborn as sns
from scipy import stats
import plotly.figure_factory as ff
import plotly.graph_objs as go
import matplotlib.pyplot as plt
import matplotlib.ticker as tick
from mpl_toolkits.mplot3d import Axes3D
%matplotlib inline
labels=['Profil 2 ','Profil 3a','Profil 4a','Profil 4b']
var = ['Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)']
colors = ['blue','red','green','magenta']
Using pandas package to read .csv output results files:
# Load using Pandas module csv file
prof_2 = pd.read_csv('zone_2.csv',names=['ID_BEE','Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)','k238/f238','k234/f234'],encoding='utf-8',delimiter="\t",dtype=np.float64)
prof_3 = pd.read_csv('zone_3a.csv',names=['ID_BEE','Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)','k238/f238','k234/f234'],encoding='utf-8',delimiter="\t",dtype=np.float64)
prof_4 = pd.read_csv('zone_4a.csv',names=['ID_BEE','Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)','k238/f238','k234/f234'],encoding='utf-8',delimiter="\t",dtype=np.float64)
prof_5 = pd.read_csv('zone_4b.csv',names=['Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)','k238/f238','k234/f234'],encoding='utf-8',delimiter="\t",dtype=np.float64)
Store data into pandas.DataFrame container and read first 5 lines of each table:
# Store data in data frame object from Pandas module
df_p2 = pd.DataFrame(prof_2,columns=['Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)'])
df_p3 = pd.DataFrame(prof_3,columns=['Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)'])
df_p4 = pd.DataFrame(prof_4,columns=['Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)'])
df_p5 = pd.DataFrame(prof_5,columns=['Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)'])
# Print first 5 results
print("Profil 2 :",prof_2.head())
print(30*'---')
print("Profil 3a :",prof_3.head())
print(30*'---')
print("Profil 4a :",prof_4.head())
print(30*'---')
print("Profil 4b :",prof_5.head())
from IPython.display import display,clear_output
from ipywidgets import interact, IntSlider, Dropdown
val_stat = np.zeros((8,5,np.size(stats.t.fit(df_p2[var[1]]))))
#parameter_slider = IntSlider(min=0, max=4, step=1, value=0)
parameter_drop = Dropdown(options=[('Time y', 0), ('P m/My', 1), ('k238', 2), ('f238', 3), ('k234', 4), ('f234', 5),('k234/k238',6),('f234/f238',7)])
profile_drop = Dropdown(options=[('Profile 2', 0), ('Profile 3a', 1), ('Profile 3b', 2), ('Profile 4a', 3)])
# decorate the plot function with an environment from the UIs:
@interact(parameter=parameter_drop, profile=profile_drop)
def plot(parameter, profile):
fig, ax = plt.subplots(figsize=(15, 2))
xlim = [1.0E+6,4.0,1.0E-5,1.0E-5,1.0E-5,1.0E-5,4.,4.]
bins = 1 ; mean = 0. ; sigma = 1.
if(profile==0):
n, bins, patches = plt.hist(df_p2[var[parameter]], 400, density=1,color=colors[profile])
val_stat[parameter,profile,:] = stats.t.fit(df_p2[var[parameter]])
mean = val_stat[parameter,profile,1]
sigma = val_stat[parameter,profile,2]
if(parameter==0):
val_stat[parameter,profile,1] = np.mean(df_p2[var[parameter]])
val_stat[parameter,profile,2] = np.std(df_p2[var[parameter]])
if(profile==1):
n, bins, patches = plt.hist(df_p3[var[parameter]], 400, density=1,color=colors[profile])
val_stat[parameter,profile,:] = stats.t.fit(df_p3[var[parameter]])
#if(parameter==0):
#val_stat[parameter,profile,1] = np.mean(df_p3[var[parameter]])
#val_stat[parameter,profile,2] = np.std(df_p3[var[parameter]])
if(profile==2):
n, bins, patches = plt.hist(df_p4[var[parameter]], 400, density=1,color=colors[profile])
val_stat[parameter,profile,:] = stats.t.fit(df_p4[var[parameter]])
if(parameter==0):
val_stat[parameter,profile,1] = np.mean(df_p4[var[parameter]])
val_stat[parameter,profile,2] = np.std(df_p4[var[parameter]])
if(profile==3):
n, bins, patches = plt.hist(df_p5[var[parameter]], 400, density=1,color=colors[profile])
val_stat[parameter,profile,:] = stats.t.fit(df_p5[var[parameter]])
if(parameter==0):
val_stat[parameter,profile,1] = np.mean(df_p5[var[parameter]])
val_stat[parameter,profile,2] = np.std(df_p5[var[parameter]])
mean = val_stat[parameter,profile,1]
sigma = val_stat[parameter,profile,2]
ax.plot(bins, stats.norm.pdf(bins, mean, sigma))
ax.axvline(x=mean,c=colors[profile],ls='dashed')
ax.axvspan(mean-sigma, mean+sigma,alpha=0.3,facecolor=colors[profile])
ax.set_xlabel('Parameter : %s'%var[parameter])
ax.text(np.max(bins)/1.5,np.max(n)/2., r'$\mu$ = %e'%mean)
ax.text(np.max(bins)/1.5,np.max(n)/3., r'$\pm\sigma$= %e'%sigma)
ax.set_xlim(0.0,np.max(bins))
ax.set_ylabel('Probability')
ax.set_title('Profile : %s '%labels[profile])
print('%s'%labels[profile],'==> mean = %e'%mean,'| standard deviation ','= %e'%sigma)
plt.show()
| Parameter | Time (y) | $p$ (m/My) | $k_{238}$ (1/s) | $f_{238}$ (1/s) | $k_{234}$ (1/s) | $f_{234}$ (1/s) | $\dfrac{k_{238}}{k_{234}}$ | $\dfrac{f_{238}}{f_{234}}$ |
|---|---|---|---|---|---|---|---|---|
| Mean $\mu$ | $891~224$ | $2.207958$ | $5.373379~10^{-7}$ | $17.44775~10^{-7}$ | $9.595960~10^{-7}$ | $24.93313~10^{-7}$ | 1.977343 | 1.435569 |
| Std deviation $\sigma~\pm$ | $99~528$ | $0.2095690$ | $1.942053~10^{-7}$ | $2.301647~10^{-7}$ | $2.937391~10^{-7}$ | $3.146862~10^{-7}$ | 0.3963533 | 0.1365413 |
| Parameter | Time (y) | $p$ (m/My) | $k_{238}$ (1/s) | $f_{238}$ (1/s) | $k_{234}$ (1/s) | $f_{234}$ (1/s) | $\dfrac{k_{238}}{k_{234}}$ | $\dfrac{f_{238}}{f_{234}}$ |
|---|---|---|---|---|---|---|---|---|
| Mean $\mu$ | $303~840$ | $2.664715$ | $6.794840~10^{-7}$ | $20.56547~10^{-7}$ | $9.137099~10^{-7}$ | $23.59388~10^{-7}$ | 1.353190 | 1.150026 |
| Std deviation $\sigma~\pm$ | $39~600$ | $0.2348235$ | $1.704398~10^{-7}$ | $2.243455~10^{-7}$ | $2.345804~10^{-7}$ | $3.095209~10^{-7}$ | 0.2558826 | 0.1128398 |
| Parameter | Time (y) | $p$ (m/My) | $k_{238}$ (1/s) | $f_{238}$ (1/s) | $k_{234}$ (1/s) | $f_{234}$ (1/s) | $\dfrac{k_{238}}{k_{234}}$ | $\dfrac{f_{238}}{f_{234}}$ |
|---|---|---|---|---|---|---|---|---|
| Mean $\mu$ | $590~150$ | $1.846073$ | $24.17446~10^{-7}$ | $9.921260~10^{-7}$ | $30.49398~10^{-7}$ | $14.29396~10^{-7}$ | 1.287298 | 1.511653 |
| Std deviation $\sigma~\pm$ | $21~629$ | $0.5859635$ | $11.41665~10^{-7}$ | $7.806470~10^{-7}$ | $16.33527~10^{-7}$ | $11.73991~10^{-7}$ | 0.1414916 | 0.2558008 |
| Parameter | Time (y) | $p$ (m/My) | $k_{238}$ (1/s) | $f_{238}$ (1/s) | $k_{234}$ (1/s) | $f_{234}$ (1/s) | $\dfrac{k_{238}}{k_{234}}$ | $\dfrac{f_{238}}{f_{234}}$ |
|---|---|---|---|---|---|---|---|---|
| Mean $\mu$ | $791~312$ | $2.334735$ | $12.43773~10^{-7}$ | $5.232120~10^{-7}$ | $17.86370~10^{-7}$ | $7.210024~10^{-7}$ | 1.516394 | 1.610835 |
| Std deviation $\sigma~\pm$ | $158~785$ | $0.2347057$ | $2.733054~10^{-7}$ | $1.994557~10^{-7}$ | $3.112318~10^{-7}$ | $2.416315~10^{-7}$ | 0.2982813 | 0.5624723 |
# Create distplot with curve_type set to 'normal'
fig = ff.create_distplot([df_p2['P'],df_p3['P'],df_p4['P'],df_p5['P']],labels, bin_size=0.01,
curve_type='kde', # override default 'kde'
colors=colors,show_rug=False,show_hist=True)
for l in range(0,4):
print("Mean : %f"%val_stat[1,l,1]," $+/- std dev : %f"%val_stat[1,l,2])
fig.add_trace(
go.Scatter(
x=[val_stat[1,l,1], val_stat[1,l,1]],
y=[0., 8],
name="Mean : %s "%labels[l],
mode="lines",
line=go.scatter.Line(color=colors[l],dash="dot"),
showlegend=True)
)
# Add Std deviation
fig.update_layout(
shapes=[
# filled Rect Std dev prof 1
go.layout.Shape(type="rect",
x0=val_stat[1,0,1]-val_stat[1,0,2],
y0=0,
x1=val_stat[1,0,1]+val_stat[1,0,2],
y1=8,
name="Std dev : %s "%labels[0],line=dict(color=colors[0],width=0),fillcolor=colors[0],
opacity=0.2,layer="below"),
# filled Rect Std dev prof 2
go.layout.Shape(type="rect",
x0=val_stat[1,1,1]-val_stat[1,1,2],
y0=0,
x1=val_stat[1,1,1]+val_stat[1,1,2],
y1=8,
name="Std dev : %s "%labels[1],line=dict(color=colors[1],width=0),fillcolor=colors[1],
opacity=0.2,layer="below"),
# filled Rect Std dev prof 3
go.layout.Shape(type="rect",
x0=val_stat[1,2,1]-val_stat[1,2,2],
y0=0,
x1=val_stat[1,2,1]+val_stat[1,2,2],
y1=8,
name="Std dev : %s "%labels[2],line=dict(color=colors[2],width=0),fillcolor=colors[2],
opacity=0.2,layer="below"),
# filled Rect Std dev prof 4
go.layout.Shape(type="rect",
x0=val_stat[1,3,1]-val_stat[1,3,2],
y0=0,
x1=val_stat[1,3,1]+val_stat[1,3,2],
y1=8,
name="Std dev : %s "%labels[3],line=dict(color=colors[3],width=0),fillcolor=colors[3],
opacity=0.2,layer="below"),
]
)
# Add title
fig.update_layout(title_text='KDE distribution for Production rate')
fig.show()
# Create distplot with curve_type set to 'normal'
fig = ff.create_distplot([df_p2['Time'],df_p3['Time'],df_p4['Time'],df_p5['Time']],labels, bin_size=10000.,
curve_type='kde', # override default 'kde'
colors=colors,show_rug=False,show_hist=True)
# Add title
fig.update_layout(title_text='KDE distribution for Final time')
fig.show()
import pandas as pd
from plotly import plotly
from plotly import tools
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, plot, iplot
init_notebook_mode()
import numpy as np
prof= pd.read_csv('zone_2.csv',names=['ID_BEE','Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)','k238/f238','k234/f234'],encoding='utf-8',delimiter="\t",dtype=np.float64)
df= pd.DataFrame(prof,columns=['Fobj','Time','P','k238','f238','k234','f234','k(234/238)','f(234/238)'])
fig = go.Figure(data=[go.Scatter3d(x=df['P'], y=df['k238'], z=df['f238'],mode='markers',marker=dict(
size=1,
color=df['Fobj'], # set color to an array/list of desired values
colorscale='Jet', # choose a colorscale
colorbar=dict(
title="Fobj"
),
opacity=0.8))])
iplot(fig, filename = 'graph_3.html')
import plotly.express as px
fig = px.scatter_3d(df_p2,
x='P',
y='k238',
z='f238',
color='Fobj',
size='Fobj',
size_max=10,
color_continuous_scale=px.colors.sequential.Jet)
fig.show()
fig = px.scatter_matrix(df_p2,
dimensions=['Time','P','k238','f238','Fobj','Fobj'],
color='Fobj',
size='Fobj',
size_max=2.5,opacity=0.5,
color_continuous_scale=px.colors.sequential.Jet)
fig.show()